Abstract — We consider optimal distributed controller synthesis
for an interconnected system subject to communication
constraints, in linear quadratic settings. Motivated by the 
problem of finite heavy duty vehicle platooning, we study 
systems composed of interconnected subsystems over a chain 
graph. By decomposing the system into orthogonal modes, the 
cost function can be separated into individual components. 
Thereby, derivation of the optimal controllers in state-space 
follows immediately. The optimal controllers are evaluated 
under the practical setting of heavy duty vehicle platooning with 
communication constraints. It is shown that the performance 
can be significantly improved by adding a few communication 
links. The results show that the proposed optimal distributed 
controller performs almost as well as the centralized linear 
quadratic Gaussian controller and outperforms a suboptimal 
controller in terms of control input. Furthermore, the control 
input energy can be reduced significantly with the proposed 
controller compared to the suboptimal controller, depending 
on the vehicle position in the platoon. Thus, the importance of 
considering preceding vehicles as well as the following vehicles 
in a platoon for fuel optimality is concluded. 

I. INTRODUCTION 

The systems to be controlled are, in many application 
domains, getting larger and more complex. When there 
is interconnection between different dynamical systems, 
conventional optimal control algorithms provide a solution 
where centralized state information is required. However, 
it is often preferable and sometimes necessary to have a 
decentralized controller structure, since in many practical 
problems, the physical or communication constraints often 
impose a specific interconnection structure. Hence, it is 
interesting to design decentralized feedback controllers for 
systems of a certain structure and examine their overall 
performance. 

The control problem in this paper is motivated by systems,
generally referred to as vehicle platooning, involving a chain
of closely spaced heavy duty vehicles (HDVs). Information
technology is paving its path into the transport industry,
enabling automated control strategies. By governing vehicle
platoons with an automated control strategy, the overall
traffic flow is expected to improve [11] and the road
capacity will increase significantly [8]. With radar sensors,
each vehicle is able to measure the relative distance and
velocity of the preceding vehicle. The radar measurements
are conveyed further down the chain of vehicles through
wireless communication. By traveling at a close intermediate
spacing, the air drag is reduced for each vehicle in the
platoon. Thereby, the control effort and inherently the fuel
consumption can be reduced significantly. However, as the
intermediate spacing is reduced, the control becomes tighter
due to safety aspects, mandating an increase in control action
through additional acceleration and braking. Hence, it is of
great interest for the industry to find a fuel optimal control.
Thus, with limited information and control input constraints,
the control objective is to maintain a predefined headway
to the vehicle ahead based upon local state measurements,
which makes it a decentralized control problem.

Decentralized control problems are still intractable in gen- 
eral. One approach has been to classify specific information 
patterns leading to linear optimal controllers. In [22], suffi- 
cient conditions are given under which optimal controllers 
are linear in the linear quadratic setting. An important result 
was given in [10] which showed that for a new information 
structure, referred to as partially nested, the optimal policy 
is linear in the information set. In [12], the stochastic linear
quadratic control problem was solved under the condition
that all the subsystems have access to the global information
from some time in the past. In [5], it was shown that the
constrained linear optimal decision problem for infinite horizon
linear quadratic control can be posed as an infinite dimensional
convex optimization problem, given that the considered system
is stable. Control for chain structures in the context
of platoons has been studied through various perspectives, 
e.g., [4], [6], [13], [3], [17], [18], [20]. It has been shown 
that control strategies may vary depending on the available 
information within the platoon. However, communication 
constraints have not in general been considered in control 
design for platooning applications. 

The aim of this study is to synthesize controllers for a 
practical decentralized system composed of M interacting 
systems over a chain. We minimize a quadratic cost under the 
partially nested information structure. This problem is known 
to have a linear optimal policy, [10] and [21]. However, most
existing approaches do not provide explicit optimal controller
formulae, and the order of the controllers can be large [9],
which makes the implementation difficult. Some work has
focused on finding numerical algorithms for these problems,
[15] and [24]. Recently, state-space solutions to the so-called
two-player state-feedback $\mathcal{H}_2$ version of this problem
have been given in [19]. Also, in [16], using concepts 
from order theory, a control architecture has been proposed 
for systems having the structure of a partially ordered set. 
In contrast, we construct conditional estimates based on
the information shared among the controllers. Thereby, we
show how to decompose the states, control inputs, and as a 
result, the cost function into independent terms. Having the 
cost function decomposed into individual pieces, analytical 
derivation of the optimal controllers follows immediately. 

The main contribution of this paper is to introduce a simple 
decomposition scheme to construct optimal decentralized 
controllers with low computational complexity for chain 
structures which is applicable to intelligent transportation 
systems in terms of automated platooning. Derived from the 
characteristics of actual Scania HDVs, we present a discrete 
system model that includes physical coupling with a preced- 
ing vehicle. In the context of HDV platooning, we explicitly 
study systems composed of two and three interconnected 
subsystems over a chain structure. The proposed control 
scheme accounts for a constrained communication pattern 
among the vehicles and hence reduces the communications 
compared to a centralized information pattern where full state 
information is available to each controller. We also evaluate 
the performance of the optimal controllers for a typical sce- 
nario in HDV platooning under normal operating conditions, 
with respect to the imposed information constraints. 

The outline of the remainder of this paper is as follows.
First we specify the problem that we are considering in
Section II. Then, the finite and infinite horizon optimal
controller formulation for the simplest case, the two-vehicle
problem, will be presented in Section III. In Section IV we
will show how the decomposition scheme can be extended to
the case of three interconnected subsystems. We apply the
three-vehicle optimal distributed controller to the example
of HDV platooning in Section V, where we evaluate the proposed
controller in comparison with the optimal centralized
controller and a suboptimal decentralized controller.

Notation. We denote a matrix partitioned into blocks by
$A = [A_{ij}]$, where $A_{ij}$ denotes the block of A in block
position $(i,j)$. The submatrix of A formed by row partitions
i through j and column partitions k through l will be denoted
by $A[i:j, k:l]$:

\[
A[i:j, k:l] = \begin{bmatrix} A_{ik} & \cdots & A_{il} \\ \vdots & \ddots & \vdots \\ A_{jk} & \cdots & A_{jl} \end{bmatrix}.
\]

The expected value of a random variable x is denoted by
$E\{x\}$. The conditional expectation of x given y is denoted
by $E\{x|y\}$. The trace of a matrix A is denoted by $\mathrm{Tr}\{A\}$,
and the sequence $x(0), x(1), \dots, x(t)$ is denoted by $x(0:t)$.

II. SYSTEM MODEL AND PROBLEM STATEMENT

In this section we present the physical properties of the 
system that we are considering. We state the nonlinear 
dynamics of a single vehicle and the model for the aero- 
dynamics, which induces the physical coupling. Then we 
present the linear discrete system model for a heterogeneous 
HDV platoon and its associated cost function. Finally, the 
problem formulation is given. 



Fig. 1. The figure shows a platoon of M heavy duty vehicles, where each
vehicle is only able to communicate with the preceding vehicles.

Fig. 2. The empirical air drag coefficient $c_D$ as a function of the
intermediate spacing d. Adapted from [23]. Similar findings are reported in
[7].



A. System Model 

We consider an HDV platoon as depicted in Figure 1. The
state equation of a single HDV is modeled as [14],



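\[
\begin{aligned}
\dot{s} &= v, \\
m_t \dot{v} &= F_{engine} - F_{brake} - F_{airdrag}(v) - F_{roll}(\alpha) - F_{gravity}(\alpha) \\
&= k_u u - k_b F_{brake} - k_d v^2 - k_{fr} \cos\alpha - k_g \sin\alpha,
\end{aligned}
\tag{1}
\]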

where v is the vehicle velocity, $m_t$ denotes the accelerated
mass and $u \in \mathbb{R}$ denotes the net engine torque.
$k_u$, $k_b$, $k_d$, $k_{fr}$, and $k_g$ denote the characteristic vehicle and
environment coefficients for the engine, brake, air drag, road
friction, and gravitation respectively.

The aerodynamic drag has a strong impact on an HDV, 
since it can amount up to 50 % of the total resistive forces at 
full speed. When traveling at short intermediate spacings, the 
wind resistance is reduced significantly. Hence, a physical 
coupling is induced between each vehicle in a platoon. 
To account for the aerodynamics, the air drag characteristic
coefficient in (1) can be modeled as
where $Q(d) = \kappa_1 d + \kappa_2$, $0 < d < 65$ is the longitudinal
relative distance between two vehicles, and $\kappa_1$, $\kappa_2$ are adjusted
according to the graphical model given in Figure 2.

In an automated HDV platoon, the vehicle velocities do not
deviate significantly from the lead vehicle's velocity.
Thus, a linearized model should give a sufficient
description of the system behavior. By linearizing and
applying a one step forward discretization to (1), the discrete
model with respect to a set reference velocity, an engine
torque which maintains the velocity, a fixed spacing between
the vehicles, and a constant slope is hence given by



\[
x(t+1) = A x(t) + B u(t) + w(t),
\tag{2}
\]

where $\delta_i$ denotes the physical coupling with a preceding
vehicle and $T_s$ is the sampling time. The derived HDV
platoon model in (3) has a lower block triangular structure,
which can generally be stated as
where the corresponding vehicle states for each subsystem
are 



Vi 



i = 2,...,M. 
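
The linearization and one step forward (Euler) discretization described above can be illustrated on the velocity dynamics of a single vehicle. The following is a minimal sketch, assuming a flat road, no braking, and purely hypothetical coefficient values (`K_U`, `K_D`, `K_FR`, `M_T`, `T_S` are not taken from the paper). It compares one Euler step of the nonlinear model with the corresponding linearized discrete model around the reference velocity.

```python
# Hypothetical parameter values, chosen only for illustration.
M_T = 40000.0   # accelerated mass [kg]
K_U = 10.0      # engine coefficient (force per unit torque), hypothetical
K_D = 3.0       # air drag coefficient, hypothetical
K_FR = 2000.0   # road friction force on a flat road [N], hypothetical
T_S = 0.1       # sampling time [s]
V0 = 19.44      # reference velocity [m/s]

# Engine torque that maintains the reference velocity (equilibrium input).
U0 = (K_D * V0 ** 2 + K_FR) / K_U


def nonlinear_step(v, u):
    """One forward Euler step of m_t * dv/dt = k_u*u - k_d*v^2 - k_fr."""
    accel = (K_U * u - K_D * v ** 2 - K_FR) / M_T
    return v + T_S * accel


# Linearized discrete model in deviation variables dv = v - V0, du = u - U0.
A_LIN = 1.0 - T_S * 2.0 * K_D * V0 / M_T
B_LIN = T_S * K_U / M_T


def linear_step(dv, du):
    """One step of the linearized discrete model dv(t+1) = a*dv(t) + b*du(t)."""
    return A_LIN * dv + B_LIN * du


dv, du = 0.5, 20.0
exact = nonlinear_step(V0 + dv, U0 + du) - V0
approx = linear_step(dv, du)
print("nonlinear deviation:", exact)
print("linearized deviation:", approx)
```

For small deviations the two values differ only by the neglected quadratic drag term, which is in line with the remark that a linearized model should be sufficient when velocities stay close to the reference.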



B. Performance Criteria 

The performance criteria of an HDV platoon can be 
mapped into quadratic costs. Hence, we formulate the weight 
parameters for a quadratic cost function based upon perfor- 
mance and safety objectives. The objective of the lead vehicle 
is to minimize the fuel consumption and control input, while 
maintaining a set reference velocity. The objective of the
follower vehicles is, in addition, to follow the preceding vehicle's
velocity while maintaining a set intermediate spacing. The
intermediate spacing reference could be constant or, as in
this case, time varying. It is determined by setting a desired
time gap $\tau$ s, which in turn determines the spacing policy as

\[
d_{ref}(t) = \tau v(t).
\]

Thereby, the vehicles will maintain a larger intermediate
spacing at higher velocities. Hence, the weights for an M-HDV
platoon can be set up as

N-l M 

J(u*)=min ^ (y^Wj(d(i-i)i{t) - TVj(t)) 2 

t=Q i=2 

+ W { V (Vi-l(t) - Vi(t)) 2 
Ri = Wi



The weights in (5) give a direct interpretation of how to
enforce the objectives for a vehicle traveling in a platoon.
The value of $w_i^{\tau}$ determines the importance of not deviating
from the desired time gap. Hence, a large $w_i^{\tau}$ puts emphasis
on safety. $w_i^{\Delta v}$ creates a cost for deviating from the velocity
of the preceding vehicle, and $w_i^{u_i}$ punishes the control
effort, which is proportional to the fuel consumption. The
following terms, $w_i^{d}$, $w_i^{v}$, put a cost on the deviation from the
linearized states. Note that the main objective is to maintain
a set intermediate distance, while maintaining a fuel efficient
behavior. Therefore, $w_i^{\tau}$, $w_i^{\Delta v}$ and $w_i^{u_i}$ must be set larger than
the remaining weights. The weights are chosen such that Q
is positive semidefinite and R is positive definite.



(4) 

C. Problem Formulation 



Although the approach used in this paper is applicable for 
systems over general acyclic graphs, for simplicity we will 
concentrate on two simple chain structures, which we refer
to as two- and three-vehicle chains. The aim is to synthesize
controllers under imposed communication constraints. 



For the two-vehicle chain the system matrices have the
sparsity structure

\[
A = \begin{bmatrix} A_{11} & 0 \\ A_{21} & A_{22} \end{bmatrix}, \qquad
B = \begin{bmatrix} B_1 & 0 \\ 0 & B_2 \end{bmatrix}.
\tag{7}
\]



Assume $\{w(t)\}$ is a sequence of mutually independent
Gaussian vectors with zero mean values and covariance given
by

\[
E\{w(k) w^T(l)\} = \begin{bmatrix} W_1 & 0 \\ 0 & W_2 \end{bmatrix} \delta(k-l).
\]


It is assumed that x(0) = 0. 

In this system, the dynamics of subsystem 1 (Vehicle 
1) propagates to subsystem 2 (Vehicle 2) but not vice- 
versa. If both subsystems have access to the global state 
measurements the information structure would be classical, 
and the optimal linear controller could be obtained from the 
linear quadratic control theory. However, in the practical 
setting of HDV platooning the lead vehicle only has its 
own state information, whereas the follower vehicle can 
also measure the states of the preceding vehicle through 
radar sensors. Therefore, we consider the case in which
$u_2$ has access to the overall measurement history, while
$u_1$ has access to its own measurements. Let $\mathcal{I}_t^i$ denote the
information set of controller i at time t. Then

\[
\mathcal{I}_t^1 = \{x_1(0:t)\}, \qquad \mathcal{I}_t^2 = \{x(0:t)\}.
\tag{8}
\]


This information pattern is not classical anymore and is a 
simple case of a partially nested information structure. This 
is one of a few non-classical information patterns for which 
the optimal policy is known to be unique and linear in the 
information set. For the chain of three vehicles, the matrices
are given by
Here, $\{w(t)\}$ is a Gaussian disturbance vector with covariance
given by
To maintain partial nestedness, the information set for the
controllers is given by

\[
\begin{aligned}
\mathcal{I}_t^1 &= \{x_1(0:t)\}, \qquad \mathcal{I}_t^2 = \{x_1(0:t), x_2(0:t)\}, \\
\mathcal{I}_t^3 &= \{x_1(0:t), x_2(0:t), x_3(0:t)\},
\end{aligned}
\tag{10}
\]



where only one communication link is needed from vehicle 1 
to vehicle 3, since vehicle 2 and 3 can measure the preceding 
vehicle states with on-board radar sensors. 
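
The nested structure of the information sets in (10), and the count of wireless links they require, can be checked with a few lines of code. This is a minimal sketch: the helper names are hypothetical, and the extension of (10) to a general number of vehicles (each controller knows the state history of itself and all preceding vehicles, with radar covering only the immediately preceding vehicle) is an assumption made here for illustration.

```python
def information_sets(num_vehicles):
    """Map controller i to the set of vehicle indices whose state history it knows."""
    return {i: set(range(1, i + 1)) for i in range(1, num_vehicles + 1)}


def is_nested(info):
    """Check that the information sets grow along the chain: I^1 <= I^2 <= ..."""
    order = sorted(info)
    return all(info[a] <= info[b] for a, b in zip(order, order[1:]))


def wireless_links(info):
    """List links (j, i) that radar cannot provide.

    Radar on vehicle i is assumed to measure only vehicle i - 1, so any other
    preceding vehicle j in the information set of i needs a wireless link.
    """
    links = []
    for i, known in sorted(info.items()):
        for j in sorted(known):
            if j not in (i, i - 1):
                links.append((j, i))
    return links


info = information_sets(3)
print(is_nested(info))       # True
print(wireless_links(info))  # [(1, 3)]
```

For the three-vehicle chain this reproduces the single link from vehicle 1 to vehicle 3 mentioned above.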

Thus, the problem that we solve is finding an analytical
formulation for optimal controllers constrained to specified
information sets that minimize the infinite-horizon quadratic
cost
subject to the given system dynamics and performance
objectives. We first give an explicit solution for the two-vehicle
problem defined by (7) and (8), where the intuition
behind the solution is derived. To show how the proposed
technique can be applied to more general chains, we then
present an explicit solution for the three-vehicle problem
with dynamics given in (9) subject to the constraints in (10).

III. TWO-VEHICLE CHAIN

The aim of this section is to present the optimal control
synthesis for the simplest case of the problem, which is a
chain of two vehicles. The derivation given in this section
explains the decomposition idea and the structure of the
controllers. First, we shall present the optimal controller in
Section III-A. Next, the derivation of the time-varying and
the stationary controller will be explained in Section III-B.
Finally, we conclude with some remarks in Section III-C.



A. Main Result 

Theorem 1: Assume that 

i) $(A, B)$ is stabilizable,

ii) $(A_{22}, B_2)$ is stabilizable,

iii) $(Q, A)$ is detectable,

iv) $(Q_{22}, A_{22})$ is detectable.

Then, the optimal controller for the two-vehicle chain is 
given by: 



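\[
\eta(t+1) = (A_{22} - B_2 L_{22})\,\eta(t) + \begin{bmatrix} A_{21} - B_2 L_{21} & 0 \end{bmatrix} x(t),
\]
\[
\begin{aligned}
u_1(t) &= -L_{11} x_1(t) - L_{12}\,\eta(t), \\
u_2(t) &= -L_{21} x_1(t) - L_{22}\,\eta(t) - L^2 \big( x_2(t) - \eta(t) \big),
\end{aligned}
\]

and the optimal cost is

\[
\mathrm{Tr}(X_{11} W_1) + \mathrm{Tr}(Y W_2).
\]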

The matrices X and Y are the positive semidefinite stabilizing
solutions to the Riccati equations

\[
\begin{aligned}
X &= A^T X A + Q - A^T X B (B^T X B + R)^{-1} B^T X A, \\
Y &= A_{22}^T Y A_{22} + Q_{22} - A_{22}^T Y B_2 (B_2^T Y B_2 + R_{22})^{-1} B_2^T Y A_{22},
\end{aligned}
\]

and the matrix X is partitioned into blocks compatible with
the partitions of A:

\[
X = [X_{ij}], \quad i,j = 1,2.
\]

The gain matrices $L^1$ and $L^2$ are given by

\[
\begin{aligned}
L^1 &= (R + B^T X B)^{-1} B^T X A, \\
L^2 &= (R_{22} + B_2^T Y B_2)^{-1} B_2^T Y A_{22},
\end{aligned}
\]

and $L^1$ is partitioned into blocks according to

\[
L^1 = [L_{ij}], \quad i,j = 1,2.
\]

Before giving the proof of the theorem, we need to state 
the following lemma and corollary. 



Lemma 1: Consider the system described by (2), and
introduce the following Riccati equation

\[
P(t) = A^T P(t+1) A + Q - A^T P(t+1) B \big( B^T P(t+1) B + R \big)^{-1} B^T P(t+1) A,
\]

for $t = 0, \dots, N-1$, with the end condition $P(N) = Q$, where
Q is positive semidefinite. Then,
where $L(t)$ is given by

\[
L(t) = (R + B^T P(t+1) B)^{-1} B^T P(t+1) A.
\]

Proof: See for example [2]. ■
Corollary 1: Assume that $x(0) = 0$, $\{w(t)\}$ is a sequence
of uncorrelated Gaussian variables with the covariance W,
and $w(t)$ is independent of $x(t)$ and $u(t)$. Then,

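\[
\begin{aligned}
& E\Big\{ x^T(N) Q x(N) + \sum_{t=0}^{N-1} \big( x^T(t) Q x(t) + u^T(t) R u(t) \big) \Big\} = \\
& \quad E\Big\{ \sum_{t=0}^{N-1} \big( u(t) + L(t) x(t) \big)^T \big( B^T P(t+1) B + R \big) \big( u(t) + L(t) x(t) \big) \Big\}
 + \sum_{t=0}^{N-1} \mathrm{Tr}\big( P(t+1) W \big),
\end{aligned}
\]

where $L(t)$ and $P(t)$ are given in Lemma 1.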
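
Lemma 1 and Corollary 1 rest on a completion-of-squares argument along the Riccati recursion. The following minimal sketch checks that argument numerically in a simplified setting that is an assumption made here for illustration: a scalar system, no disturbance ($w(t) = 0$), and an arbitrary initial state. In that case the cost equals $P(0) x(0)^2$ plus the weighted squares of $u(t) + L(t) x(t)$. The function names and numerical values are hypothetical.

```python
import random


def riccati_backward(a, b, q, r, horizon):
    """Scalar Riccati recursion with end condition P(N) = q.

    Returns the lists P[0..N] and L[0..N-1].
    """
    P = [0.0] * (horizon + 1)
    L = [0.0] * horizon
    P[horizon] = q
    for t in range(horizon - 1, -1, -1):
        p_next = P[t + 1]
        L[t] = a * b * p_next / (r + b * b * p_next)
        P[t] = a * a * p_next + q - (a * b * p_next) ** 2 / (b * b * p_next + r)
    return P, L


def both_sides(a, b, q, r, x0, inputs):
    """Evaluate the quadratic cost and its completed-squares form for w = 0."""
    horizon = len(inputs)
    P, L = riccati_backward(a, b, q, r, horizon)
    x = x0
    cost = 0.0
    squares = P[0] * x0 * x0
    for t, u in enumerate(inputs):
        cost += q * x * x + r * u * u
        squares += (b * b * P[t + 1] + r) * (u + L[t] * x) ** 2
        x = a * x + b * u
    cost += q * x * x  # terminal term x(N)^T Q x(N)
    return cost, squares


random.seed(0)
u_seq = [random.uniform(-1.0, 1.0) for _ in range(20)]
lhs, rhs = both_sides(1.05, 0.5, 1.0, 2.0, 0.7, u_seq)
print(lhs, rhs)
```

Because the squared terms are nonnegative, the choice $u(t) = -L(t) x(t)$ minimizes the cost, which is the step used repeatedly in the derivations that follow.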

B. Optimal Controller Derivation 

Based on the information constraints in (8), we want to
find the controllers restricted to the following structure:

\[
\begin{aligned}
u_1(t) &= f_1(x_1(0:t)), \\
u_2(t) &= f_2(x(0:t)),
\end{aligned}
\tag{12}
\]

where $f_i$, $i = 1,2$, denote linear functions in their arguments.

To derive the optimal controller, we will first consider a 
finite-horizon version of the problem with the cost function 
given by 

\[
J = E\Big\{ x^T(N) Q x(N) + \sum_{t=0}^{N-1} \big( x^T(t) Q x(t) + u^T(t) R u(t) \big) \Big\}.
\]

To find a structure for the controllers, we decompose the
state variable into two independent terms as

\[
x(t) = z^1(t) + z^2(t),
\]

where $z^1(t) := E\{x(t) | x_1(0:t)\}$, and $z^2(t) := x(t) - z^1(t)$.
The term $z^1$ is the conditional estimate of x given the
information shared between the controllers, namely $x_1(0:t)$,
and $z^2$ is the estimation error. Let these vectors be partitioned
as $z^i(t) = \begin{bmatrix} z_1^i(t) \\ z_2^i(t) \end{bmatrix}$, $i = 1, 2$. Clearly, the first component of
$z^1(t)$ is $x_1(t)$. Hence

\[
z^1(t) = \begin{bmatrix} x_1(t) \\ z_2^1(t) \end{bmatrix}, \qquad
z^2(t) = \begin{bmatrix} 0 \\ z_2^2(t) \end{bmatrix}.
\]

Analogously, the control input is decomposed as $u(t) = u^1(t) + u^2(t)$,
where $u^1$ and $u^2$ are independent terms defined
by

\[
u^1(t) := E\{u(t) | x_1(0:t)\}, \qquad u^2(t) := u(t) - u^1(t).
\]

Lemma 2: The update equations for $z^1$ and $z^2$ are given
by:

\[
\begin{aligned}
z^1(t+1) &= A z^1(t) + B u^1(t) + \begin{bmatrix} w_1(t) \\ 0 \end{bmatrix}, \\
z^2(t+1) &= A z^2(t) + B u^2(t) + \begin{bmatrix} 0 \\ w_2(t) \end{bmatrix}.
\end{aligned}
\]

Proof: See Appendix.

Now, considering $u(t)$ on the form given by (12), we find
that

\[
u^1(t) = E\{u(t) | x_1(0:t)\}
= E\left\{ \begin{bmatrix} f_1(x_1(0:t)) \\ f_2(x(0:t)) \end{bmatrix} \Big|\, x_1(0:t) \right\}
= \begin{bmatrix} f_1(x_1(0:t)) \\ f_2(z^1(0:t)) \end{bmatrix},
\]

where the last equality follows from the fact that
$E\{f_2(x(0:t)) | x_1(0:t)\} = f_2(E\{x(0:t) | x_1(0:t)\}) = f_2(z^1(0:t))$.
Thus, by linearity of $f_2$, $u^2$ has the structure

\[
u^2(t) = \begin{bmatrix} 0 \\ f_2(z^2(0:t)) \end{bmatrix}.
\]

By partitioning these vectors as $u^i = \begin{bmatrix} u_1^i \\ u_2^i \end{bmatrix}$, $i = 1, 2$, it can
be seen that $u_1^2(t) = 0$, so the control input for subsystem
1 is given as the first component of the vector $u^1$, while
subsystem 2's input is separated into the two independent
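terms, namely $u_2^1$ and $u_2^2$. In other words, we have

\[
u_1(t) = u_1^1(t), \qquad u_2(t) = u_2^1(t) + u_2^2(t).
\]

Decomposition of the states and inputs into independent
terms, and having $u^1$ and $u^2$ given as functions of $z^1$ and
$z^2$ (which are independent terms), implies that the vectors

\[
\begin{bmatrix} z^1(t) \\ u^1(t) \end{bmatrix}
\quad \text{and} \quad
\begin{bmatrix} z^2(t) \\ u^2(t) \end{bmatrix}
\]

are independent. As a result, J can be decomposed as $J = J_1 + J_2$, where, for $i = 1, 2$,

\[
J_i = E\Big\{ z^{i\,T}(N) Q z^i(N) + \sum_{t=0}^{N-1} \big( z^{i\,T}(t) Q z^i(t) + u^{i\,T}(t) R u^i(t) \big) \Big\}.
\]

Note that having $z_1^2$ and $u_1^2$ equal to zero implies that only
the second component of $z^2$ is nonzero. The dynamics for
this component can be written as

\[
z_2^2(t+1) = A_{22} z_2^2(t) + B_2 u_2^2(t) + w_2(t).
\]

Noting that $w_i(t)$ is independent of $z^i(t)$, $u^i(t)$, we can apply
Corollary 1 to transform $J_1$ and $J_2$:

\[
\begin{aligned}
J ={}& E \sum_{t=0}^{N-1} \big( u^1(t) + L^1(t) z^1(t) \big)^T \big( B^T X(t+1) B + R \big) \big( u^1(t) + L^1(t) z^1(t) \big) \\
&+ E \sum_{t=0}^{N-1} \big( u_2^2(t) + L^2(t) z_2^2(t) \big)^T \big( B_2^T Y(t+1) B_2 + R_{22} \big) \big( u_2^2(t) + L^2(t) z_2^2(t) \big) \\
&+ \sum_{t=0}^{N-1} \big( \mathrm{Tr}(X_{11}(t+1) W_1) + \mathrm{Tr}(Y(t+1) W_2) \big),
\end{aligned}
\tag{13}
\]

where we also used $x(0) = 0$. The matrices $X(t)$ and $Y(t)$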
are computed recursively by 

\[
\begin{aligned}
X(t) ={}& A^T X(t+1) A + Q - A^T X(t+1) B \big( B^T X(t+1) B + R \big)^{-1} B^T X(t+1) A, \\
Y(t) ={}& A_{22}^T Y(t+1) A_{22} + Q_{22} - A_{22}^T Y(t+1) B_2 \big( B_2^T Y(t+1) B_2 + R_{22} \big)^{-1} B_2^T Y(t+1) A_{22},
\end{aligned}
\]

with the end conditions $X(N) = Q$, $Y(N) = Q_{22}$. The gain
matrices $L^1$ and $L^2$ are given by

\[
\begin{aligned}
L^1(t) &= \big( R + B^T X(t+1) B \big)^{-1} B^T X(t+1) A, \\
L^2(t) &= \big( R_{22} + B_2^T Y(t+1) B_2 \big)^{-1} B_2^T Y(t+1) A_{22}.
\end{aligned}
\]

Quadratic minimization of (13) simply gives the optimal
inputs $u^1$ and $u_2^2$ as

\[
u^{1*}(t) = -L^1(t) z^1(t), \qquad u_2^{2*}(t) = -L^2(t) z_2^2(t).
\]

Let X be partitioned into appropriately sized blocks,
$[X_{ij}]$, $i,j = 1,2$. Then the optimal cost becomes

\[
J^* = \sum_{t=0}^{N-1} \big( \mathrm{Tr}(X_{11}(t+1) W_1) + \mathrm{Tr}(Y(t+1) W_2) \big).
\]

To find a mapping from x to u, let $L^1$ be partitioned into the
blocks $[L_{ij}]$, so we get the control action $u^1$ on the form

\[
\begin{aligned}
u_1^{*}(t) &= -L_{11} x_1(t) - L_{12} z_2^1(t), \\
u_2^{1*}(t) &= -L_{21} x_1(t) - L_{22} z_2^1(t),
\end{aligned}
\]

and the update equation for $z_2^1$ becomes

\[
z_2^1(t+1) = (A_{22} - B_2 L_{22}) z_2^1(t) + (A_{21} - B_2 L_{21}) x_1(t).
\]

Finally, noting that $z_2^2(t)$ is given by $x_2(t) - z_2^1(t)$, the
optimal controller can be rewritten on the form

\[
\begin{aligned}
u_1^{*}(t) &= -L_{11}(t) x_1(t) - L_{12}(t) z_2^1(t), \\
u_2^{*}(t) &= -L_{21}(t) x_1(t) - L_{22}(t) z_2^1(t) - L^2(t) \big( x_2(t) - z_2^1(t) \big).
\end{aligned}
\]



Having derived the time-varying representation for the controllers,
we now let N go to infinity and obtain the
steady-state form of the controller. Given that the pairs $(A, B)$
and $(A_{22}, B_2)$ are stabilizable, and the pairs $(Q, A)$ and
$(Q_{22}, A_{22})$ are detectable, $X(t)$ and $Y(t)$ converge to the
unique stabilizing solutions of the corresponding Riccati equations
and, as a result, $L^1(t)$ and $L^2(t)$ will tend to the steady-state
values $L^1$ and $L^2$ given in Theorem 1. This will yield
the controller representation given in the theorem.
Finally, the optimal cost is computed as

\[
\lim_{N \to \infty} \frac{1}{N} \sum_{t=0}^{N-1} \big( \mathrm{Tr}(X_{11}(t+1) W_1) + \mathrm{Tr}(Y(t+1) W_2) \big)
= \mathrm{Tr}(X_{11} W_1) + \mathrm{Tr}(Y W_2).
\]



C. Discussion 

The state vector $\begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}$ is fed into the controller by a
lower-triangular gain matrix, and hence $u_1$ is not dependent
on $x_2$.

Note that $z_2^1(t)$ (same variable as $\eta(t)$ in Theorem 1) is
the minimum mean-square estimate of $x_2(t)$ based on $x_1$, that
is $E\{x_2(t) | x_1(0), \dots, x_1(t)\}$. Therefore, $z_2^2(t)$ represents the
error of this estimation.

For convenience, let $\hat{x}_{2|1}(t)$ denote the estimate of $x_2(t)$
based on the history of $x_1$ and let $e_{2|1}(t)$ represent the estimation
error. Then we can write the controllers on a more intuitive
form:

\[
\begin{aligned}
u_1^*(t) &= -L_{11} x_1(t) - L_{12} \hat{x}_{2|1}(t), \\
u_2^*(t) &= -L_{21} x_1(t) - L_{22} \hat{x}_{2|1}(t) - L^2 e_{2|1}(t).
\end{aligned}
\]

Thus, both controllers use $\hat{x}_{2|1}$ instead of $x_2$ in the form of an
optimal centralized control. However, controller 2 contains an
additional term which is constructed based on the estimation
error $e_{2|1}$.

We see that the order of each controller is equal to the 
state dimension of subsystem 2. It is easy to see that in 
a centralized information pattern where the value of x 2 is 
known to controller 1, the error term disappears and the 
controller reduces to a static gain similar to a classical linear 
quadratic regulator problem. 
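
The structure discussed above can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: both subsystems are scalar, the numerical values of A, B, Q, and R are hypothetical, and the Riccati equations are solved by plain fixed-point iteration rather than by a dedicated solver. It builds the controller of Theorem 1 and checks that the input of vehicle 1 is unchanged when only the disturbance acting on vehicle 2 is changed, as expected from the information constraint.

```python
import random


def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]


def transpose(P):
    return [list(col) for col in zip(*P)]


def add(P, Q, sign=1.0):
    return [[p + sign * q for p, q in zip(rp, rq)] for rp, rq in zip(P, Q)]


def inv2(M):
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def lqr_gain(A, B, X, R):
    """L = (R + B^T X B)^{-1} B^T X A for two inputs."""
    BtX = matmul(transpose(B), X)
    return matmul(inv2(add(matmul(BtX, B), R)), matmul(BtX, A))


def solve_riccati(A, B, Q, R, iters=500):
    """Fixed-point iteration X <- A^T X A + Q - A^T X B L(X)."""
    X = Q
    for _ in range(iters):
        L = lqr_gain(A, B, X, R)
        AtX = matmul(transpose(A), X)
        X = add(add(matmul(AtX, A), Q), matmul(matmul(AtX, B), L), -1.0)
    return X


def solve_scalar_riccati(a, b, q, r, iters=500):
    y = q
    for _ in range(iters):
        y = a * a * y + q - (a * b * y) ** 2 / (b * b * y + r)
    return y


# Hypothetical two-vehicle chain with scalar subsystems (lower triangular A).
A = [[0.9, 0.0], [0.2, 0.95]]
B = [[0.5, 0.0], [0.0, 0.5]]
Q = [[1.0, 0.0], [0.0, 1.0]]
R = [[1.0, 0.0], [0.0, 1.0]]

X = solve_riccati(A, B, Q, R)
L1 = lqr_gain(A, B, X, R)                       # blocks L_11, L_12, L_21, L_22
Y = solve_scalar_riccati(A[1][1], B[1][1], Q[1][1], R[1][1])
L2 = A[1][1] * B[1][1] * Y / (R[1][1] + B[1][1] ** 2 * Y)


def simulate(w1, w2):
    """Run the controller of Theorem 1 and return the input sequence of vehicle 1."""
    x1 = x2 = eta = 0.0
    u1_log = []
    for n1, n2 in zip(w1, w2):
        u1 = -L1[0][0] * x1 - L1[0][1] * eta
        u2 = -L1[1][0] * x1 - L1[1][1] * eta - L2 * (x2 - eta)
        u1_log.append(u1)
        eta_next = (A[1][1] - B[1][1] * L1[1][1]) * eta + (A[1][0] - B[1][1] * L1[1][0]) * x1
        x1, x2 = (A[0][0] * x1 + B[0][0] * u1 + n1,
                  A[1][0] * x1 + A[1][1] * x2 + B[1][1] * u2 + n2)
        eta = eta_next
    return u1_log


rng = random.Random(1)
w1 = [rng.gauss(0.0, 1.0) for _ in range(50)]
w2a = [rng.gauss(0.0, 1.0) for _ in range(50)]
w2b = [rng.gauss(0.0, 1.0) for _ in range(50)]
print(simulate(w1, w2a) == simulate(w1, w2b))
```

In this sketch the variable `eta` plays the role of the estimate of $x_2$ based on the history of $x_1$, so it can be propagated by both controllers without any knowledge of $x_2$.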

IV. THREE-VEHICLE CHAIN

The optimal controller synthesis for the three-vehicle 
version of the problem will be studied here. This section 
extends the result of Theorem 1 to three interconnected 
subsystems. Although the approach is similar, here the infor- 
mation available to the controllers shall be decomposed into 
three components instead of two, and hence the cost function 
will be decomposed accordingly. Since the scheme has been
explained in detail in Section III, a more concise derivation
will be given here.

A. Main Result 

Theorem 2: Assume that 

i) $(A, B)$, $(A[2:3, 2:3], B[2:3, 2:3])$, and $(A_{33}, B_3)$ are
stabilizable,

ii) $(Q, A)$, $(Q[2:3, 2:3], A[2:3, 2:3])$, and $(Q_{33}, A_{33})$ are
detectable.



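Then, the optimal controller for the three-vehicle chain is
given by:

\[
\begin{aligned}
\begin{bmatrix} \eta_2(t+1) \\ \eta_3(t+1) \end{bmatrix} &= (A - B L^1)[2:3, 1:3] \begin{bmatrix} x_1(t) \\ \eta_2(t) \\ \eta_3(t) \end{bmatrix}, \\
\nu_3(t+1) &= (\bar{A} - \bar{B} L^2)[2, 1:2] \begin{bmatrix} x_2(t) - \eta_2(t) \\ \nu_3(t) \end{bmatrix}, \\
\begin{bmatrix} u_1(t) \\ u_2(t) \\ u_3(t) \end{bmatrix} &= -L^1 \begin{bmatrix} x_1(t) \\ \eta_2(t) \\ \eta_3(t) \end{bmatrix}
- \begin{bmatrix} 0 \\ L^2 \begin{bmatrix} x_2(t) - \eta_2(t) \\ \nu_3(t) \end{bmatrix} \end{bmatrix}
- \begin{bmatrix} 0 \\ 0 \\ L^3 \big( x_3(t) - \eta_3(t) - \nu_3(t) \big) \end{bmatrix},
\end{aligned}
\]

and the optimal cost is

\[
\mathrm{Tr}(X^1_{11} W_1) + \mathrm{Tr}(X^2_{11} W_2) + \mathrm{Tr}(X^3 W_3).
\]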

The matrices $X^1$, $X^2$, and $X^3$ are the positive semidefinite
stabilizing solutions to the Riccati equations

\[
\begin{aligned}
X^1 &= A^T X^1 A + Q - A^T X^1 B (B^T X^1 B + R)^{-1} B^T X^1 A, \\
X^2 &= \bar{A}^T X^2 \bar{A} + \bar{Q} - \bar{A}^T X^2 \bar{B} (\bar{B}^T X^2 \bar{B} + \bar{R})^{-1} \bar{B}^T X^2 \bar{A}, \\
X^3 &= A_{33}^T X^3 A_{33} + Q_{33} - A_{33}^T X^3 B_3 (B_3^T X^3 B_3 + R_{33})^{-1} B_3^T X^3 A_{33},
\end{aligned}
\]

where $\bar{A} = A[2:3, 2:3]$, $\bar{B} = B[2:3, 2:3]$, $\bar{Q} = Q[2:3, 2:3]$
and $\bar{R} = R[2:3, 2:3]$. The matrix $X^1$ is
partitioned into blocks according to the partitions of x as

\[
X^1 = [X^1_{ij}], \quad i,j = 1, \dots, 3,
\]

also, $X^2$ is partitioned according to the dimensions of $x_2$
and $x_3$ as

\[
X^2 = [X^2_{ij}], \quad i,j = 1,2.
\]

The gain matrices are given by

\[
\begin{aligned}
L^1 &= (R + B^T X^1 B)^{-1} B^T X^1 A, \\
L^2 &= (\bar{R} + \bar{B}^T X^2 \bar{B})^{-1} \bar{B}^T X^2 \bar{A}, \\
L^3 &= (R_{33} + B_3^T X^3 B_3)^{-1} B_3^T X^3 A_{33}.
\end{aligned}
\]

B. Optimal Controller Derivation 

The controllers in this case are restricted to the form

\[
u(t) = \begin{bmatrix}
f_{11}(x_1(0:t)) \\
f_{21}(x_1(0:t)) + f_{22}(x_2(0:t)) \\
f_{31}(x_1(0:t)) + f_{32}(x_2(0:t)) + f_{33}(x_3(0:t))
\end{bmatrix},
\]

where $f_{ij}$ are linear functions.

Again, the information shared among the controllers is
$x_1(0:t)$, and hence each controller can run an estimator
using this piece of information. Using this idea, we will
decompose the state and control input into two independent
terms. Let

\[
x(t) = z^1(t) + \tilde{z}(t),
\]

where $z^1(t) := E\{x(t) | x_1(0:t)\}$ and $\tilde{z}(t) := x(t) - z^1(t)$.
Clearly, the first component of $z^1$ is $x_1$, and the other
components of $z^1$ and $\tilde{z}$ will be labeled as:

\[
z^1(t) = \begin{bmatrix} x_1(t) \\ z_2^1(t) \\ z_3^1(t) \end{bmatrix}, \qquad
\tilde{z}(t) = \begin{bmatrix} 0 \\ \tilde{z}_2(t) \\ \tilde{z}_3(t) \end{bmatrix}.
\]

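The control variable will also be decomposed as
$u(t) = u^1(t) + \tilde{u}(t)$, where $u^1(t) := E\{u(t) | x_1(0:t)\}$
and $\tilde{u}(t) := u(t) - u^1(t)$ are independent terms.
Similar to the argument made for the two-vehicle problem,
since $u_1(t)$ is a function of the history of $x_1$, the first
component of $u^1$ is $u_1$. As a result, $u^1$ and $\tilde{u}$ will have the
structure:

\[
u^1(t) = \begin{bmatrix} u_1(t) \\ u_2^1(t) \\ u_3^1(t) \end{bmatrix}, \qquad
\tilde{u}(t) = \begin{bmatrix} 0 \\ \tilde{u}_2(t) \\ \tilde{u}_3(t) \end{bmatrix},
\tag{15}
\]

where

\[
\begin{bmatrix} \tilde{u}_2(t) \\ \tilde{u}_3(t) \end{bmatrix} =
\begin{bmatrix} f_{22}(\tilde{z}_2(0:t)) \\ f_{32}(\tilde{z}_2(0:t)) + f_{33}(\tilde{z}_3(0:t)) \end{bmatrix}.
\tag{16}
\]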



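Similar to Lemma 2, the following recursive equations can
be found for $z^1$ and $\tilde{z}$:

\[
z^1(t+1) = A z^1(t) + B u^1(t) + \begin{bmatrix} w_1(t) \\ 0 \\ 0 \end{bmatrix},
\tag{17}
\]
\[
\tilde{z}(t+1) = A \tilde{z}(t) + B \tilde{u}(t) + \begin{bmatrix} 0 \\ w_2(t) \\ w_3(t) \end{bmatrix}.
\tag{18}
\]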



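Since the first row in (18) is zero, this equation reduces to

\[
\begin{bmatrix} \tilde{z}_2(t+1) \\ \tilde{z}_3(t+1) \end{bmatrix} =
\begin{bmatrix} A_{22} & 0 \\ A_{32} & A_{33} \end{bmatrix}
\begin{bmatrix} \tilde{z}_2(t) \\ \tilde{z}_3(t) \end{bmatrix} +
\begin{bmatrix} B_2 & 0 \\ 0 & B_3 \end{bmatrix}
\begin{bmatrix} \tilde{u}_2(t) \\ \tilde{u}_3(t) \end{bmatrix} +
\begin{bmatrix} w_2(t) \\ w_3(t) \end{bmatrix}.
\]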



This system, with the information constraints stated in (16),
has a similar structure to the two-vehicle problem. Hence,
the states and inputs of (18) will be decomposed in a similar
manner. In this case $\tilde{z}_2(0:t)$ is the information which is
shared among the controllers $\tilde{u}_2$ and $\tilde{u}_3$. Note that since
controller 2 and controller 3 have access to $x_1$ and $x_2$, they
both can construct $\tilde{z}_2(t) = x_2(t) - z_2^1(t)$ at each time step.
We get

\[
\begin{aligned}
\tilde{z}(t) &= \underbrace{E\{\tilde{z}(t) | \tilde{z}_2(0:t)\}}_{z^2(t)} + \underbrace{\tilde{z}(t) - E\{\tilde{z}(t) | \tilde{z}_2(0:t)\}}_{z^3(t)}, \\
\tilde{u}(t) &= \underbrace{E\{\tilde{u}(t) | \tilde{z}_2(0:t)\}}_{u^2(t)} + \underbrace{\tilde{u}(t) - E\{\tilde{u}(t) | \tilde{z}_2(0:t)\}}_{u^3(t)}.
\end{aligned}
\tag{14}
\]

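The components of $z^2(t)$, $z^3(t)$, $u^2(t)$, and $u^3(t)$ can be
labeled as

\[
z^2(t) = \begin{bmatrix} 0 \\ z_2^2(t) \\ z_3^2(t) \end{bmatrix}, \quad
z^3(t) = \begin{bmatrix} 0 \\ 0 \\ z_3^3(t) \end{bmatrix}, \quad
u^2(t) = \begin{bmatrix} 0 \\ u_2^2(t) \\ u_3^2(t) \end{bmatrix}, \quad
u^3(t) = \begin{bmatrix} 0 \\ 0 \\ u_3^3(t) \end{bmatrix},
\]

where $z_3^2(t) := E\{\tilde{z}_3(t) | z_2^2(0:t)\}$, $z_3^3(t) := \tilde{z}_3(t) - z_3^2(t)$,
$u_3^2(t) := E\{\tilde{u}_3(t) | z_2^2(0:t)\}$, and $u_3^3(t) := \tilde{u}_3(t) - u_3^2(t)$.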

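Finally, we get the following orthogonal decomposition of
the states and controls

\[
\begin{aligned}
x(t) &= \begin{bmatrix} x_1(t) \\ z_2^1(t) \\ z_3^1(t) \end{bmatrix}
+ \begin{bmatrix} 0 \\ z_2^2(t) \\ z_3^2(t) \end{bmatrix}
+ \begin{bmatrix} 0 \\ 0 \\ z_3^3(t) \end{bmatrix}, \\
u(t) &= \begin{bmatrix} u_1(t) \\ u_2^1(t) \\ u_3^1(t) \end{bmatrix}
+ \begin{bmatrix} 0 \\ u_2^2(t) \\ u_3^2(t) \end{bmatrix}
+ \begin{bmatrix} 0 \\ 0 \\ u_3^3(t) \end{bmatrix},
\end{aligned}
\]

where the dynamics of $z^1$ is given in (17), and the dynamics
for the non-zero components of $z^2$ and $z^3$ are given by

\[
\begin{aligned}
\begin{bmatrix} z_2^2(t+1) \\ z_3^2(t+1) \end{bmatrix} &= \bar{A} \begin{bmatrix} z_2^2(t) \\ z_3^2(t) \end{bmatrix}
+ \bar{B} \begin{bmatrix} u_2^2(t) \\ u_3^2(t) \end{bmatrix}
+ \begin{bmatrix} w_2(t) \\ 0 \end{bmatrix}, \\
z_3^3(t+1) &= A_{33} z_3^3(t) + B_3 u_3^3(t) + w_3(t).
\end{aligned}
\tag{19}
\]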

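Using Corollary 1, the cost function can be separated into
three quadratic forms similar to the ones in (13). Minimization
of these quadratic forms gives the optimal controllers
as

\[
\begin{aligned}
\begin{bmatrix} u_1^*(t) \\ u_2^{1*}(t) \\ u_3^{1*}(t) \end{bmatrix} &= -L^1(t) \begin{bmatrix} x_1(t) \\ z_2^1(t) \\ z_3^1(t) \end{bmatrix}, \\
\begin{bmatrix} u_2^{2*}(t) \\ u_3^{2*}(t) \end{bmatrix} &= -L^2(t) \begin{bmatrix} z_2^2(t) \\ z_3^2(t) \end{bmatrix}, \\
u_3^{3*}(t) &= -L^3(t) z_3^3(t),
\end{aligned}
\tag{20}
\]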






where the gain matrices $L^1$, $L^2$, and $L^3$ are given by

\[
\begin{aligned}
L^1(t) &= \big( R + B^T X^1(t+1) B \big)^{-1} B^T X^1(t+1) A, \\
L^2(t) &= \big( \bar{R} + \bar{B}^T X^2(t+1) \bar{B} \big)^{-1} \bar{B}^T X^2(t+1) \bar{A}, \\
L^3(t) &= \big( R_{33} + B_3^T X^3(t+1) B_3 \big)^{-1} B_3^T X^3(t+1) A_{33},
\end{aligned}
\]

and $X^1$, $X^2$, and $X^3$ are the solutions to the Riccati
equations

\[
\begin{aligned}
X^1(t) ={}& A^T X^1(t+1) A + Q - A^T X^1(t+1) B \big( B^T X^1(t+1) B + R \big)^{-1} B^T X^1(t+1) A, \\
X^2(t) ={}& \bar{A}^T X^2(t+1) \bar{A} + \bar{Q} - \bar{A}^T X^2(t+1) \bar{B} \big( \bar{B}^T X^2(t+1) \bar{B} + \bar{R} \big)^{-1} \bar{B}^T X^2(t+1) \bar{A}, \\
X^3(t) ={}& A_{33}^T X^3(t+1) A_{33} + Q_{33} - A_{33}^T X^3(t+1) B_3 \big( B_3^T X^3(t+1) B_3 + R_{33} \big)^{-1} B_3^T X^3(t+1) A_{33},
\end{aligned}
\]

where $\bar{Q} = Q[2:3, 2:3]$ and $\bar{R} = R[2:3, 2:3]$. The end conditions are
$X^1(N) = Q$, $X^2(N) = Q[2:3, 2:3]$, and $X^3(N) = Q_{33}$.

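To find the mapping from x to u, the update equations for
the terms $z_2^1$, $z_3^1$, and $z_3^2$ must be obtained. By closing the
loop in (17) and (19) by the corresponding controllers we
obtain

\[
\begin{bmatrix} z_2^1(t+1) \\ z_3^1(t+1) \end{bmatrix} = (A - B L^1)[2:3, 1:3] \begin{bmatrix} x_1(t) \\ z_2^1(t) \\ z_3^1(t) \end{bmatrix},
\tag{21}
\]
\[
z_3^2(t+1) = (\bar{A} - \bar{B} L^2)[2, 1:2] \begin{bmatrix} x_2(t) - z_2^1(t) \\ z_3^2(t) \end{bmatrix},
\tag{22}
\]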



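and finally the time-varying version of the controllers can be
rewritten as

\[
\begin{bmatrix} u_1^*(t) \\ u_2^*(t) \\ u_3^*(t) \end{bmatrix} =
-L^1(t) \begin{bmatrix} x_1(t) \\ z_2^1(t) \\ z_3^1(t) \end{bmatrix}
- \begin{bmatrix} 0 \\ L^2(t) \begin{bmatrix} x_2(t) - z_2^1(t) \\ z_3^2(t) \end{bmatrix} \end{bmatrix}
- \begin{bmatrix} 0 \\ 0 \\ L^3(t) \big( x_3(t) - z_3^1(t) - z_3^2(t) \big) \end{bmatrix}.
\tag{23}
\]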



By letting N go to infinity, the controllers converge to the
stationary form given in Theorem 2. To compute the optimal
cost in this case, let us partition the matrix $X^1$ into blocks
$[X^1_{ij}]$, $i,j = 1,2,3$, in accordance with the partitions of A,
and do the same with $X^2$ according to the partitions of $\bar{A}$,
that is $X^2 = [X^2_{ij}]$, $i,j = 1,2$.

Then, the infinite-horizon optimal cost is obtained by

\[
\begin{aligned}
& \lim_{N \to \infty} \frac{1}{N} \sum_{t=0}^{N-1} \big( \mathrm{Tr}(X^1_{11}(t+1) W_1) + \mathrm{Tr}(X^2_{11}(t+1) W_2) + \mathrm{Tr}(X^3(t+1) W_3) \big) \\
& \quad = \mathrm{Tr}(X^1_{11} W_1) + \mathrm{Tr}(X^2_{11} W_2) + \mathrm{Tr}(X^3 W_3).
\end{aligned}
\]

V. NUMERICAL RESULTS 

In this section, we implement the proposed controller
on an M = 3 HDV platoon (Figure 1) and evaluate the
performance through a realistic scenario that HDV platoons
often face on the road. We assume that the vehicles in the
platoon can only measure the velocity and relative distance of
the preceding vehicle and only receive information through
wireless communication from all the preceding vehicles. This
assumption is made to evaluate if a small addition in communication
links could improve the system performance. Hence,
a numerical comparison is made between the proposed 
controller and a suboptimal controller specifically designed 
for HDV platooning [1]. The suboptimal controller uses local 
information, namely it only accounts for the dynamics of the 
preceding vehicle. Finally, we compare the proposed con- 
troller with the fully centralized linear quadratic controller. 

When studying the behavior of vehicles within a finite
platoon, the velocity does not deviate significantly from
the lead vehicle's velocity trajectory. The control strategy
is simply to provide an input that maintains the platoon
velocity at a set relative distance. In practice, many random
disturbances, such as wind variation, changing topology, or
varying road properties, act on the system. These
disturbances are modeled as disturbances in the state
measurements. An additional disturbance of interest is a
mandated deviation in the lead vehicle's velocity, which
often occurs due to traffic events that the lead vehicle must
adhere to. Hence, integral action for the lead vehicle is also
added as a state to the platoon model in order to capture
such disturbances. The system for an $M = 3$ HDV platoon
can thereby be grouped into sub-blocks for controller
design, where



Bi 



- 
1 



fan 



A; 



-1 



i ^.i(i-l) 



Bo 



1 


2,3. 



The modeled HDVs travel in the longitudinal direction on a
flat road. We consider a heterogeneous platoon, where the
masses are set to $[m_1, m_2, m_3] = [30000, 40000, 30000]$ kg.
All the vehicles are assumed to be traveling at the steady-state
velocity $v_0 = 19.44$ m/s (70 km/h) with time gap $\tau = 1$ s,
which gives an intermediate distance of $d_0 = 19.44$ m. The
maximum engine and braking torques for a commercial HDV
vary with vehicle configuration but can be approximated as
2500 Nm and 60000 Nm/axle, respectively.
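
The operating point quoted above follows from a unit conversion and the constant time gap policy (spacing equals velocity times time gap). A minimal check, with the helper name `operating_point` chosen here for illustration:

```python
KMH_PER_MS = 3.6  # 1 m/s = 3.6 km/h


def operating_point(speed_kmh, time_gap_s):
    """Return (steady-state velocity in m/s, spacing in m) for a time gap policy."""
    speed_ms = speed_kmh / KMH_PER_MS
    return speed_ms, speed_ms * time_gap_s


v0, d0 = operating_point(speed_kmh=70.0, time_gap_s=1.0)
print(round(v0, 2), round(d0, 2))  # 19.44 19.44
```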

State disturbances as well as several lead vehicle deviation
disturbances are imposed on the system; see Figure 3. The
lead vehicle deviation disturbances can be explained by the
following scenario. The platoon travels along a road where
the road speed is 70 km/h. Suddenly, a slower vehicle enters
the lane through a shoulder path (at the 45 s time marker).
The lead vehicle must therefore reduce its speed to 60 km/h,
in turn forcing the follower vehicles to reduce their speed
and adapt their relative distance accordingly. After a while,
the slower vehicle increases its speed to the road speed
of 70 km/h and no longer inhibits the platoon (120 s time
marker). Hence, the lead vehicle resumes the road speed,
and the follower vehicles automatically adapt their speed
and distance as well. Finally, the platoon arrives at a point
where the road speed is changed to 80 km/h (180 s time
marker).
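
The scenario amounts to a piecewise-constant speed reference for the lead vehicle. The sketch below encodes it as a function of time. Treating the changes as ideal steps that take effect exactly at the stated time markers is an assumption, and the function name is hypothetical.

```python
def lead_reference_speed_kmh(t):
    """Mandated lead vehicle speed in km/h at time t (seconds).

    Steps at 45 s, 120 s and 180 s follow the described scenario;
    the instantaneous switching is an idealization.
    """
    if t < 45.0:
        return 70.0   # initial road speed
    if t < 120.0:
        return 60.0   # slower vehicle ahead
    if t < 180.0:
        return 70.0   # slower vehicle has sped up
    return 80.0       # new road speed


for t in (0.0, 45.0, 100.0, 120.0, 180.0):
    print(t, lead_reference_speed_kmh(t))
```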

Figure 3 shows the velocity trajectories of the three-HDV
platoon in the top plot and the corresponding intermediate
spacings in the bottom plot for this scenario. The
trajectories obtained with the optimal decentralized
controller are drawn in bold, and those obtained with the
suboptimal decentralized controller are drawn with thinner
lines. We see that the proposed optimal controller performs
well. The suboptimal controller displays a slightly harsher
behavior with a faster speed change, since it does not take
follower vehicles into account. Hence, the required control
input energy is much higher for the suboptimal controller
than for our proposed controller, as can be seen in Figure 4.
The first two rows in Table I state the total control input
energy required to handle the imposed disturbances. We can
see that the optimal distributed controller reduces the control
input energy by 10.4% for the lead vehicle, by 16.3% for
the second vehicle, and by 15.5% for the third vehicle. By
estimating the states of the follower vehicles, the proposed
controller mimics a centralized control strategy and displays
a smoother behavior. However, the reduced control energy
is obtained at the cost of adding a communication link
between vehicle 1 and vehicle 3, since vehicle 1's state
cannot be measured through the radar mounted on vehicle 3.
Furthermore, the average velocity is reduced by 2.6%. Travel
time is also important for fleet operators. Nevertheless, it
is clear that there is a considerable saving in fuel
consumption at the cost of an additional communication link
and a much smaller penalty in travel time.

Fig. 3. Three-HDV platoon, where a disturbance in the velocity of the lead
vehicle is imposed. The top plot shows the velocity trajectories for the
$M = 3$ HDV platoon and the bottom plot shows the intermediate spacings.
The trajectories obtained through the optimal decentralized controller are
bold and subindexed with $i,D$, and the trajectories obtained through the
suboptimal controller are subindexed with $i,S$, where $i = 1, 2, 3$ denotes
the platoon position index.

Fig. 4. Corresponding input torque to handle the imposed disturbances
in Figure 3. Similarly, the trajectories obtained through the optimal
decentralized controller are bold and subindexed with $i,D$, and the
trajectories obtained through the suboptimal controller are subindexed
with $i,S$, $i = 1, 2, 3$.

Furthermore, the results in the last four rows of Table I
show that the control inputs required to handle the
disturbances are well within the feasible physical range.



TABLE I

Required control input (torque) to handle the disturbances in Figure 3

i                              1        2        3
$\|u_{i,D}\|_2$ [kNm]          81.9     100.1    80.8
$\|u_{i,S}\|_2$ [kNm]          91.4     112.6    89.0
$u^{\max}_{i,D}$ [kNm]         1.41     1.76     1.37
$u^{\max}_{i,S}$ [kNm]         1.92     2.39     1.8
$u^{\min}_{i,D}$ [kNm]         -0.79    -1.09    -0.75
$u^{\min}_{i,S}$ [kNm]         -1.3     -1.72    -1.19
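
The feasibility statement can be checked directly against the torque limits quoted earlier (2500 Nm engine torque, 60000 Nm/axle braking torque). The sketch below compares the peak values of Table I with these limits. Using the braking limit of a single axle as the bound is a conservative assumption, and the names `PEAKS`, `is_feasible`, and `engine_utilization` are introduced here for illustration.

```python
ENGINE_LIMIT_KNM = 2.5   # maximum engine torque, 2500 Nm
BRAKE_LIMIT_KNM = 60.0   # maximum braking torque per axle; one axle assumed

# Peak torques from Table I in kNm, for platoon positions i = 1, 2, 3.
# "D": optimal decentralized controller, "S": suboptimal controller.
PEAKS = {
    "D": {"max": [1.41, 1.76, 1.37], "min": [-0.79, -1.09, -0.75]},
    "S": {"max": [1.92, 2.39, 1.80], "min": [-1.30, -1.72, -1.19]},
}


def is_feasible(u_max, u_min):
    """True if the peak drive and brake torques respect both limits."""
    return u_max <= ENGINE_LIMIT_KNM and -u_min <= BRAKE_LIMIT_KNM


def engine_utilization(u_max):
    """Fraction of the maximum engine torque used at the peak."""
    return u_max / ENGINE_LIMIT_KNM


for name, rows in PEAKS.items():
    for i, (u_max, u_min) in enumerate(zip(rows["max"], rows["min"]), start=1):
        print(name, i, is_feasible(u_max, u_min),
              round(engine_utilization(u_max), 2))
```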



The heaviest vehicle, with platoon index $i = 2$, naturally
requires the largest control input to handle the disturbances.
It also appears to be where the highest relative difference in
cost is obtained. By estimating the states of the follower
vehicles, the control input required to handle the presented
disturbances can be reduced significantly. Both the maximum
and the minimum control inputs are smaller in magnitude for
the optimal decentralized controller.

The optimal decentralized controller was also compared 
with a centralized control strategy. Since the proposed con- 
troller also accounts for all the states in the platoon by 
estimating the states of the follower vehicles, the behavior 
is close to the centralized controller. The computed relative 
differences in the cost function as well as the difference 
in required control inputs to handle the disturbances are 
minimal. 

VI. CONCLUSIONS 

We have presented a quadratic optimal distributed control
method for chain structures, with applications to
heterogeneous vehicle platooning under communication
constraints. A procedure has been given for constructing
low-order optimal decentralized controllers through a simple
decomposition scheme. A discrete-time HDV platoon model
that includes physical coupling between the vehicles has been
derived, and the controllers have been evaluated on it. The
results show that the total control input energy required by
the proposed controller is very close to that of a centralized
controller, which needs communication among all the
vehicles, and is significantly lower than that of a suboptimal
controller which only accounts for the immediately preceding
vehicle. In particular, by estimating the interaction with the
follower vehicles, performance can be improved by adding a
communication link from the first to the third vehicle in a
three-vehicle platoon. Thus, considering preceding vehicles
as well as follower vehicles is significant for fuel optimality.

A natural extension of the presented work is to derive
explicit solutions for the problem of $M$ HDVs. It would
also be interesting to consider time delays in the
communication links between the vehicles. Both are planned
for future work.

VII. APPENDIX



Proof of Lemma 2:

We have $z^1(t+1) = \mathbf{E}\{x(t+1) \mid x_1(0:t+1)\} =
\mathbf{E}\{x(t+1) \mid x_1(0:t), x_1(t+1)\}$. To evaluate the
conditional expectation given $x_1(0:t)$ and $x_1(t+1)$, we
change variables so that we obtain independent variables.
Note that we can construct $x_1(t+1) = A_{11} x_1(t) + B_1 u_1(t) + w_1(t)$
given that $x_1(0:t)$ and $w_1(t)$ are available. Hence, instead
of evaluating the conditional expectation of $x(t+1)$ given
$x_1(0:t)$ and $x_1(t+1)$, we will evaluate the expectation
given the independent variables $x_1(0:t)$ and $w_1(t)$. In other
words, $w_1(t)$ is the part of $x_1(t+1)$ that was not previously
available in $x_1(0:t)$. Thus we find that

$$\begin{aligned}
z^1(t+1) &= \mathbf{E}\{Ax(t) + Bu(t) + w(t) \mid x_1(0:t), w_1(t)\} \\
&= \mathbf{E}\{Ax(t) + Bu(t) + w(t) \mid x_1(0:t)\} + \mathbf{E}\{Ax(t) + Bu(t) + w(t) \mid w_1(t)\} \\
&= Az^1(t) + Bu^1(t) + \begin{bmatrix} w_1(t) \\ 0 \end{bmatrix},
\end{aligned}$$

where we also used the pairwise independence of $w_1(t)$,
$w_2(t)$, $x(t)$, and $u(t)$. So, $z^2$ is given by

$$z^2(t+1) = x(t+1) - \mathbf{E}\{x(t+1) \mid x_1(0:t+1)\} = Az^2(t) + Bu^2(t) + \begin{bmatrix} 0 \\ w_2(t) \end{bmatrix}.$$
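
The two update equations can be sanity-checked by simulation. The sketch below propagates $x$, $z^1$, and $z^2$ for a two-subsystem example with scalar states and confirms that $z^1 + z^2$ reproduces $x$ and that the first component of $z^2$ stays at zero, so that $z^1_1 = x_1$. The lower-triangular matrices, the gains `L1` and `l2`, the zero initial state, and the unit-variance Gaussian noise are all assumptions chosen for illustration; they are not the platoon model.

```python
import random


def step(A, B, state, control, noise):
    """One step of state(t+1) = A state + B control + noise."""
    n = len(state)
    return [sum(A[i][j] * state[j] for j in range(n))
            + sum(B[i][j] * control[j] for j in range(n))
            + noise[i]
            for i in range(n)]


def simulate(steps=50, seed=0):
    rng = random.Random(seed)
    A = [[0.9, 0.0], [0.3, 0.8]]     # lower triangular (chain structure)
    B = [[1.0, 0.0], [0.0, 1.0]]
    L1 = [[0.4, 0.0], [0.1, 0.3]]    # hypothetical gain acting on z^1
    l2 = 0.5                         # hypothetical gain acting on z^2_2
    x, z1, z2 = [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]
    max_err, max_z2_first = 0.0, 0.0
    for _ in range(steps):
        u1 = [-(L1[i][0] * z1[0] + L1[i][1] * z1[1]) for i in range(2)]
        u2 = [0.0, -l2 * z2[1]]      # vehicle 1 does not use z^2
        u = [u1[i] + u2[i] for i in range(2)]
        w = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
        x = step(A, B, x, u, w)
        z1 = step(A, B, z1, u1, [w[0], 0.0])   # driven by w_1 only
        z2 = step(A, B, z2, u2, [0.0, w[1]])   # driven by w_2 only
        max_err = max(max_err, max(abs(x[i] - z1[i] - z2[i]) for i in range(2)))
        max_z2_first = max(max_z2_first, abs(z2[0]))
    return max_err, max_z2_first


max_err, max_z2_first = simulate()
print(max_err, max_z2_first)
```

The check only exercises the linear structure of the updates; it does not verify the conditional-expectation argument itself.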



